EDA

ENCODING CATEGORICAL DATA

LET'S TRY ANOTHER MODEL. We may return to CatBoost later.

1) LINEAR REGRESSION

LINEAR REGRESSION IS NOT A GOOD MODEL IN THIS CASE

2) POLYNOMIAL REGRESSION

POLYNOMIAL REGRESSION IS NOT A GOOD MODEL EITHER
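A minimal sketch of how the two baselines can be compared on a held-out split. The data here is synthetic and only a stand-in (an assumption, since the notebook's real features are not shown), and the names `X`, `y`, and `scores` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data (assumption): replace with the notebook's encoded features.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(scale=0.1, size=300)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    # Polynomial regression = polynomial feature expansion + linear regression.
    "poly_deg2": make_pipeline(
        PolynomialFeatures(degree=2, include_bias=False),
        StandardScaler(),
        LinearRegression(),
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_val, y_val)  # R^2 on the validation split

print(scores)
```

Scoring both models on the same validation split keeps the comparison fair; the degree of 2 is an arbitrary choice for illustration.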

3) SVR

SVR tuned with grid search still doesn't seem good enough, although it is much better than linear regression and polynomial regression.
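For reference, a grid search over SVR hyperparameters might look roughly like the sketch below. The parameter grid, the synthetic data, and the scaling step are all assumptions for illustration, not the values used in this notebook.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# SVR is sensitive to feature scale, so scaling goes inside the pipeline
# and is refit on each training fold.
pipe = make_pipeline(StandardScaler(), SVR())

# Hypothetical grid; make_pipeline names the SVR step "svr".
param_grid = {
    "svr__C": [0.1, 1, 10],
    "svr__gamma": ["scale", 0.1],
    "svr__epsilon": [0.01, 0.1],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="r2")
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)  # mean cross-validated R^2 of the best combination
```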

4) KNN Regression

I tried KNN regression for various values of n_neighbors; the best was n_neighbors=3. That is worse than SVR but better than polynomial regression.
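The search over n_neighbors can be sketched as a simple loop with cross-validation. The data is synthetic and the range 1 to 10 is an assumption; on the real data the best value reported above was 3.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

cv_scores = {}
for k in range(1, 11):
    # KNN is distance based, so features are scaled first.
    model = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=k))
    cv_scores[k] = cross_val_score(model, X, y, cv=5, scoring="r2").mean()

best_k = max(cv_scores, key=cv_scores.get)
print(best_k, cv_scores[best_k])
```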

5) Random Forest

Random forest performed well, though it scored slightly lower than CatBoost. One advantage is that RF's predictions don't contain any negative values, unlike CatBoost's.
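The absence of negative predictions is expected: each tree in a random forest predicts the mean of the training targets in a leaf, and the forest averages those means, so the output cannot fall below the smallest training target. Boosted models add up corrections and can overshoot below zero. A small sketch on synthetic data (an assumption, as are all variable names) illustrates this:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data with non-negative targets (assumption).
rng = np.random.default_rng(0)
X_train = rng.uniform(0, 10, size=(300, 3))
y_train = np.abs(2 * X_train[:, 0] + rng.normal(size=300))

# New points, some outside the range seen during training.
X_new = rng.uniform(-5, 15, size=(100, 3))

rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
rf_pred = rf.predict(X_new)

# Averages of non-negative training targets are themselves non-negative.
print(rf_pred.min(), y_train.min())

# For a model that can go negative, one common workaround is to clip.
other_pred = np.array([-0.3, 1.2, 4.5])  # hypothetical predictions
clipped = np.clip(other_pred, 0, None)
print(clipped)
```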

IMPLEMENTING RF ON TEST SET

EARLIER WE TRIED ONE-HOT ENCODING; NOW LET'S TRY DUMMY VARIABLES
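A minimal sketch of dummy-variable encoding with `pd.get_dummies`. The `city` and `size` columns are hypothetical. One thing to watch for is that train and test can end up with different columns when a category is missing from one of them, so the test frame is aligned to the train columns here.

```python
import pandas as pd

# Hypothetical toy frames; the real notebook columns are not shown.
train = pd.DataFrame({"city": ["A", "B", "C", "A"], "size": [1.0, 2.0, 3.0, 4.0]})
test = pd.DataFrame({"city": ["B", "A"], "size": [5.0, 6.0]})

# drop_first=True drops one level per feature (k-1 dummy variables),
# which avoids perfectly collinear columns in linear models.
train_enc = pd.get_dummies(train, columns=["city"], drop_first=True, dtype=int)
test_enc = pd.get_dummies(test, columns=["city"], drop_first=True, dtype=int)

# Category "C" never appears in test, so add the missing column as zeros
# and put the columns in the same order as in train.
test_enc = test_enc.reindex(columns=train_enc.columns, fill_value=0)

print(train_enc)
print(test_enc)
```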

USING K-FOLD CV ON THE cb MODEL

K-fold CV now shows a score of around 93.55 percent on the held-out folds. Now let's apply the model to the test data.
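A rough sketch of K-fold cross-validation follows. Note that the resulting score is measured on held-out folds of the training data, not on the final test set. To keep the example self-contained, scikit-learn's `GradientBoostingRegressor` is used as a stand-in for the notebook's `cb` model (an assumption); the data is synthetic as well.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 3))
y = X[:, 0] ** 2 + 2 * X[:, 1] + rng.normal(scale=0.2, size=300)

# Stand-in for the notebook's `cb` model (assumption).
model = GradientBoostingRegressor(random_state=0)

# Shuffle before splitting so fold membership does not depend on row order.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = cross_val_score(model, X, y, cv=kfold, scoring="r2")

print(fold_scores)
print(fold_scores.mean(), fold_scores.std())
```

Reporting the mean together with the standard deviation across folds gives a sense of how stable the score is before moving on to the test data.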

USING THIS "GET DUMMIES" METHOD MY SCORE WAS 0.78